Surface and subsurface oceanographic features drive forage fish distributions and aggregations: Implications for prey availability to top predators in the US Northeast Shelf ecosystem

Abstract Forage fishes are a critical food web link in marine ecosystems, aggregating in a hierarchical patch structure over multiple spatial and temporal scales. Surface‐level forage fish aggregations (FFAs) represent a concentrated source of prey available to surface‐ and shallow‐foraging marine predators. Existing survey and analysis methods are often imperfect for studying forage fishes at scales appropriate to foraging predators, making it difficult to quantify predator–prey interactions. In many cases, general distributions of forage fish species are known; however, these may not represent surface‐level prey availability to predators. Likewise, we lack an understanding of the oceanographic drivers of spatial patterns of prey aggregation and availability or forage fish community patterns. Specifically, we applied Bayesian joint species distribution models to bottom trawl survey data to assess species‐ and community‐level forage fish distribution patterns across the US Northeast Continental Shelf (NES) ecosystem. Aerial digital surveys gathered data on surface FFAs at two project sites within the NES, which we used in a spatially explicit hierarchical Bayesian model to estimate the abundance and size of surface FFAs. We used these models to examine the oceanographic drivers of forage fish distributions and aggregations. Our results suggest that, in the NES, regions of high community species richness are spatially consistent with regions of high surface FFA abundance. Bathymetric depth drove both patterns, while subsurface features, such as mixed layer depth, primarily influenced aggregation behavior and surface features, such as sea surface temperature, sub‐mesoscale eddies, and fronts influenced forage fish diversity. In combination, these models help quantify the availability of forage fishes to marine predators and represent a novel application of spatial models to aerial digital survey data.


| INTRODUC TI ON
Prey availability, a function of the density of prey resources and their accessibility to predators, is an important factor affecting the abundance and distribution of marine species (Frederiksen et al., 2006).
Marine prey species are hierarchically organized over multiple spatial and temporal scales with individuals grouping to form cohesive aggregations (e.g., swarms, schools, or shoals) at fine scales (<1 km) and aggregations forming distinct organizational patterns at submesoscales (1-10 km) and mesoscales (10-1000 km) across the broader regional seascape (spatial extents >10,000 km 2 ; Fauchald et al., 2000;Russell et al., 1992;Steele, 1978). Frequently, marine prey distributions are described at coarser mesoscale resolutions, simplified as general occupancy (i.e., presence or absence) and/or integrated over the water column (Arkema et al., 2006;Ruckelshaus et al., 2008). These generalizations discount the patchy nature of prey availability at smaller scales, which many marine predators target within the broader prey distribution to increase foraging efficiency and success (Fauchald et al., 2000;Wellenreuther & Connell, 2002). Thus, while the broad-scale distribution of prey may set the limits of marine predator distribution, the timing and spatial patterns of prey aggregations determine realized prey availability, impacting the fine and submesoscale habitat use of predators.
Small, schooling pelagic forage fishes are a critical prey resource within marine food webs, linking primary production and zooplankton to upper trophic level predators, such as seabirds, seals, cetaceans, piscivorous fishes, and squids (Cury et al., 2000;Pikitch et al., 2012).
Forage fishes form large, dense aggregations in a hierarchical patch structure that varies over fine spatial and temporal scales (Freon & Misund, 1999;Pitcher, 1986). Although other groups such as squids and juvenile stages of some piscivorous fishes (e.g., age 0-1 groundfish), also exhibit schooling behavior and can serve a similar functional role, small pelagic forage fishes remain in this role throughout their life history and are the primary forage species in many marine ecosystems (Rountos, 2016). The formation and distribution of forage fish aggregations (FFAs) are driven by a combination of their responses to the physical abiotic environment (e.g., physiological thermal constraints) and responses based on biotic interactions (e.g., foraging, predator avoidance, and spawning; Genin, 2004;Pitcher, 1986). Surface-level FFAs, in particular, are important for surface or shallow-foraging predators (e.g., plunge-diving or dipping seabirds; Fauchald, 2009) and predators that trap aggregated forage fishes between themselves and the surface as a foraging strategy (e.g., cetaceans, sharks, and pursuitdiving seabirds), which often form multispecies feeding associations for efficient exploitation (Thiebault et al., 2016).
Differences in survey and sampling methods between forage fishes and their predators make it challenging to obtain prey availability data at behaviorally relevant scales to discern predator-prey relationships, resulting in a fundamental scale mismatch between predator and prey data (Benoit-Bird et al., 2013;Fauchald et al., 2000). For instance, bottom trawl surveys are routinely used in fisheries stock assessments to discern abundance and distribution of multiple fish species across broad seascape areas (Despres-Patanjo et al., 1988). Bottom trawl surveys are not optimal for sampling low to mid-trophic level pelagic forage fishes, since the gear has species-and size-dependent selectivity, and in deeper waters may only reliably sample pelagic, schooling forage fishes upon deployment and recovery as the net moves vertically through the water column. Nonetheless, while forage fishes are primarily mid-water species, they do use the full water column over the continental shelf via several mechanisms (i.e., diel vertical migration, predator avoidance, spawning, and over-wintering ;Freon & Misund, 1999). In the Northeast U.S. Continental Shelf ecosystem (NES), forage fishes are routinely captured in bottom trawls and the distribution of these captures is systematic, likely representing true broadscale distribution tendencies Roberts represent multiple species). In the absence of FFA data, many studies of marine predator distributions rely on oceanographic features, such as bathymetry, sea surface temperature (SST), and chlorophyll concentration, as proxies for prey availability (Becker et al., 2016;Palacios et al., 2014;Torres et al., 2008), assuming these features adequately represent prey patterns. For example, some seabird species have been associated with frontal features, which have been interpreted as an aggregating mechanism for prey (Scales, Miller, Embling, et al., 2014). While some of these oceanographic features are known to play a role in the general distribution of forage fish species (Friedland et al., 2019;Suca, Deroba, et al., 2021), we lack an understanding of the physical and biological mechanisms driving the formation and distribution of surface FFAs (Cox et al., 2018;Peck et al., 2021). In addition, data on relevant oceanographic variables, such as subsurface features, are often lacking at appropriate spatiotemporal scales for modeling dynamic, ephemeral processes such as aggregation formation, leading to an incomplete understanding of the oceanographic processes driving forage fish distribution and aggregation (Brodie et al., 2018;Mannocci et al., 2017).
Data from aerial digital surveys offer a novel opportunity to assess the oceanographic processes driving the abundance and size of FFAs and whether those differ from the drivers of forage fish occurrence distributions. Surface FFAs are a product of forage fish presence and a behavioral response. Thus, distributions of FFAs are not necessarily driven by the same environmental features as broadscale occurrence. Static habitat features (i.e., bathymetric depth and bottom topography) can interact with dynamic ocean processes, creating conditions that promote the formation of FFAs (Genin, 2004;Holland et al., 2021). For example, zooplankton in coastal waters can accumulate via currents and high chlorophyll-a along productivity fronts; shallow topography then prevents downward migration of zooplankton, driving increased surface FFAs which forage on the concentrated plankton (Holland et al., 2021). Subsurface dynamic processes, such as stratification, also influence aggregating mechanisms via concentrating nutrients, subsurface productivity, and zooplankton (Genin, 2004). Abrupt changes in bottom topography can interact with water column stratification to drive FFAs to the surface (Cox et al., 2018).
We used models of the forage fish community distribution alongside independent models of FFA distribution to address the following questions: To examine broad patterns in forage fish community dynamics, we modeled the joint distribution of 15 surface-aggregating forage fish species from the NES in autumn and spring. We used digital aerial survey data of surface FFAs from the New York and Mid-Atlantic Bights to model the spatial distribution of FFA abundance and size by season. To determine which oceanographic processes influence prey availability and broad-scale forage fish distribution, we included a set of environmental covariates representing three categories: static physical habitat (i.e., bathymetric measures), dynamic surface processes (i.e., temperature and chlorophyll fronts, eddies), and dynamic subsurface processes (i.e., stratification).
We predicted that different environmental processes would drive forage fish distributions and surface FFAs with dynamic features more influential to FFAs. We found that multiple models describing forage fish distributions at community and aggregation levels can provide a more complete picture of the conditions driving the broadscale distribution and aggregation behavior of forage fishes, and, thus, prey availability for surface-and shallow-foraging marine predators.

| Study region
This study has multiple nested regions, the largest being the NES, a well-studied marine ecosystem (Sherman & Skjoldal, 2002), encom-

| Data description
2.2.1 | Bottom trawl surveys-forage fish species/ community data The NOAA Northeast Fisheries Science Center has conducted a biannual fisheries-independent bottom trawl survey across the NES ecosystem for over 50 years (1968Grosslein, 1968, Appendix 1: Section 1). Bottom trawl surveys are conducted in the boreal autumn and spring, employing a random stratified survey design with strata based primarily on depth and secondarily on latitude (Despres-Patanjo et al., 1988). Within strata, tow locations are assigned randomly prior to each seasonal survey. A minimum of two locations are sampled per strata, totaling ~300 locations per season. The trawl net has a 12.5 mm mesh liner at the codend to retain juvenile and small-bodied fishes. We used tow data with catch identification at the species level for 15 pelagic, schooling forage fishes ( Table 1). Data were standardized using calibration factors to account for vessel and gear changes in the surveys during the time series (Miller et al., 2010). However, the bottom trawl gear was not F I G U R E 1 (a) US Northeast Continental Shelf (NES) study area, (b) New York (NY) Bight aerial digital survey transects, and (c) Mid-Atlantic Bight aerial digital survey transects. Model prediction extents are depicted via blue (NES) and orange (FFA) outlines. Relevant geographic features are labeled (see legend). FFA, forage fish aggregation; ME, Maine; OPA, offshore planning area; WEA, wind energy area. designed to capture pelagic forage fishes; thus, even after standardization for temporal gear changes, the abundance/biomass data may not be fully representative of forage fishes. Therefore, we transformed raw abundance/biomass per tow to binary occupancy data (presence/absence) for distribution modeling.

| Aerial digital surveys-forage fish aggregation data
High-resolution aerial digital surveys were conducted as baseline ecological studies of designated offshore planning areas (OPAs; Figure 1b,c) for wind energy development and were designed to estimate patterns of above-water and surface-level fauna.
Detectability of submerged FFAs in these surveys varies due to water turbidity and weather conditions with the estimated average vertical penetration of the water column being ~3 m and a maximum penetration under ideal conditions of ~8-9 m (Hodgson et al., 2017;Martin Scott, HiDef Aerial Surveying, Ltd., pers. comm). Therefore, this data, including subsequent analysis and interpretation, represents only surface FFAs. Observations of the number and size of surface FFAs were collected from two aerial digital survey projects 3 cm GSD, adjusted to only 2 cm GSD for the remainder of the study to increase image clarity and color rendition for improved species identification across all taxa (Hatch et al., 2013). High-density parallel transect surveys (1 km spacing) were conducted in each of the smaller WEAs, providing ~20% coverage, while the remainder of the OPA was surveyed via a sawtooth transect path with ~2% coverage.
For both projects, FFAs were identified from the transect image data using detection software and manual review methods followed by quality control (Buckland et al., 2012;Duron et al., 2015;Hatch et al., 2013;Normandeau Associates Inc., 2020). FFAs were identified as cohesive groups of similarly sized individuals with synchronous swimming behavior, where individuals within the group were indistinguishable due to group density and small body size. Species composition of FFAs was not identifiable due to submersion and small body size, but mackerel, menhaden, herring, and hickory shad are major schooling species in the New York Bight (Normandeau Associates Inc., 2020), while menhaden, mackerel, herring, bay anchovy, alewife, and blueback herring are frequent schoolers in the Mid-Atlantic . The vertical height of the FFAs TA B L E 1 Occurrence (no. of tows present) of surface schooling forage fishes from the bottom trawl surveys of the US Northeast Continental Shelf (NES) study area included in the community distribution models.

| Environmental data
We included a combination of static habitat features and dynamic oceanographic processes in our models as environmental covariates. Initially, we considered 28 environmental covariates (13 static, 14 dynamic), encompassing a range of surface and subsurface features obtained from publicly available oceanographic data sources (Appendix 1: Tables A1 and A2). Static habitat included bathymetric terrain measures (e.g., depth, slope, rugosity) and sediment grain size.
Dynamic covariates included remote-sensed, modeled, and derived data for surface and subsurface features (Appendix 1: Table A2).
We calculated the SST seasonal anomaly (hereafter, SST anomaly) by dividing each SST value by the seasonal SST mean across the FFA 4 × 4 km grid ( Figure 1a). SST and chlorophyll fronts and frontal metrics (i.e., Fprob: front persistence, Fmean: front intensity; Table 2) were derived from the raw SST and chlorophyll remote-sensed data products (see Appendix 1: Section 2 for detailed methods). All covariates were resampled with bilinear interpolation to a 4 × 4 km grid (spatially concurrent with the FFA 4 × 4 km grid) encompassing the NES study area. Dynamic covariates were used at a daily temporal resolution, matching the observation date.
Briefly, we describe the application of HMSC to the NOAA bottom trawl data (see Ovaskainen & Abrego, 2020 for a complete model description). Independently for autumn and spring, we modeled species occurrence as a function of the environmental covariates with a probit link regression mixed model. To account for residual variation not explained by the covariates, we specified two random effects: (1) a tow-level random effect to control for unexplained variance at the sampling level and (2) a temporally explicit random effect of year, modeled with an exponentially decaying covariance structure, to account for annual variation in occupancy. The residual species associations are taken from the species-to-species variance-covariance matrix derived via the random effects . We ran two alternative community models for both seasons, substituting the two Fmean covariates for the Fprob versions.
spring covariate set was simplified by dropping rugosity to achieve model convergence ( Table 2).
We fit the models using the default prior distributions  and sampled the posterior distribution with four Markov Chain Monte Carlo (MCMC) chains (see Appendix 2: Table A1 for MCMC sampling parameters). MCMC convergence was examined visually and by calculating the potential scale reduction factors (PSRF; i.e., R ) and effective sample size (ESS) of the alpha (spatial scale of the temporal random effect), beta (fixed effects), and omega (species associations per random effect) parameters (Brooks & Gelman, 1998;Gelman & Rubin, 1992). We considered the model adequately converged if the mean and median PSRF values for the alpha (factor 1), beta, and omega parameters were less than 1.1 and if the ESS values were more than 400 (Appendix 2: Figure A1; Vehtari et al., 2021).
To determine the best-performing model (Fprob or Fmean version) for autumn and spring, we compared the Watanabe-Akaike Information Criterion (WAIC) scores (Watanabe, 2013). For the best seasonal models, we evaluated the explanatory power, predictive accuracy, and conditional predictive accuracy by computing the area under the receiver operator curve (AUC; Pearce & Ferrier, 2000). Explanatory power was calculated using the model fitted to all data to get species-specific AUC values and a summarized mean AUC. For predictive accuracy, we performed a threefold cross-validation analysis, randomly assigning each year of sampling data to one of the three folds, and computing predictions for each fold (i.e., the testing data) based on the model fitted to the remaining two folds (i.e., the training data). The number of folds chosen was in keeping with best practices for HMSC analysis, while allowing for a reasonable computational time frame. For conditional predictive accuracy, we conducted conditional cross-validation to evaluate the importance of estimated species associations to model predictions, where the species-to-species variance-covariance matrix is employed along with the estimated covariate parameters to make predictions (for details see . Finally, to assess the contributions of the fixed and random covariates to model fit, we partitioned the species-specific explained variance between the environmental covariates and each of the random effects.

| Forage fish aggregation models
Using a hierarchical Bayesian framework, we developed a model to independently estimate abundance and size of FFA. The observed number of aggregations (y ij ) per 4 km grid cell i and survey j generally followed a zero-inflated negative binomial (ZINB) distribution due to overdispersion. We parameterized the ZINB using a zero-inflated Poisson-Gamma mixture formulation: where ij , is the dispersion parameter and z ij is the zero-inflation parameter, which models the realized FFA abundance, given the presence ij , following a Bernoulli distribution.
We incorporated the season (sea1 = autumn, sea2 = spring, sea3 = winter) of survey j as a fixed covariate (categorical) on the parameter ( ij ), using a logit-linear link: The log of expected mean abundance ( ij ) was modeled as the linear predictor, log ij , with survey effort (e ij ) added as an offset: We modeled the linear predictor log ij as a log-linear equation with a spatial random effect (u 1ijk ) at the 32 km grid cell (k) scale to account for spatial autocorrelation, and a suite (n = 10) of environmental covariates (l = 1, 2, … n; Table 2): The spatial random effect (u 1ijk ) was incorporated as a proper Gaussian conditional autoregressive (CAR) model (Banerjee et al., 2003;Besag et al., 1991) that accounts for spatial dependence of grid-level data. The CAR model follows a multivariate normal distribution parameterized in terms of covariance: where I is the identity matrix, C is the normalized weight matrix, and M is the diagonal matrix of conditional variances, CAR is the precision scalar and γ is the degree of spatial dependence.
We modeled the logged observed aggregation size (log s a ) for aggregation a as a Gaussian distribution: where Size a is the mean and Size a is the variance of log s a . The estimated aggregation size ( Size a ) was modeled using a linear equation, incorporating a second spatial CAR set up as above (u 2ak ) as a random effect and the same suite of environmental covariates (l = 1, 2, … n): For consistency and to facilitate comparisons, the FFA abundance and size models were fit with the SST and Chla Fprob frontal metrics to match the best-performing community model (see Section 3). In the FFA models, we chose SST seasonal anomaly and benthic position index (BPI), instead of SST and rugosity, respectively, due to multicollinearity at the FFA study scale. (1) We used the R package NIMBLE (version 0.

| RE SULTS
During the aerial digital survey study period (

| Forage fish community models
For both seasons, the Fprob model was a marginally better fit than the Fmean model as evaluated by the WAIC scores (Table 3); thus, further discussion of the Fmean models is not presented. The autumn and spring Fprob models showed a good fit to the data with mean AUC scores of explanatory power of >0.93 for both (  Figure A2). The explanatory power, as calculated from the correlations between the observed and predicted values for the full model, indicated an adequate fit for the abundance model (0.23 correlation) and a good fit for size model (0.59). Correlations were also consistent across the in-sample crossvalidation for both models (abundance: 0.23 ± 0.002; size: 0.59 ± 0.008; Table 4). The out-of-sample predictive accuracy (

| Environmental drivers
For the community models, the environmental covariates accounted for most of the explained variance (autumn mean: 75.0%; spring mean: 62.6%), followed by the tow-level random effect (au- spring; all spp; Figure 2). In autumn, depth had the third highest effect size (0.80; Appendix 2: Table A2a), while SST only had a moderate effect size (0.24). All species, except herring and saury, had a negative relationship with depth, so as depth increased the probability of occurrence decreased (Figure 2a). In autumn, the probability of occurrence increased as SST became warmer for seven species and decreased with warmer SST for six species; only menhaden had a

Model Bvp
Explanatory power In-sample predictive accuracy (mean ± SD) Note: Abundance and size models were evaluated using a posterior predictive check with a Freeman-Tukey Goodness-of-Fit test, giving the Bayesian p-value (bpv). The correlation between observed data and model estimates was calculated as a measure of explanatory power (full model) and predictive power (fivefold cross-validation: in-sample and out-of-sample). Pearson correlation was used for the abundance model, while Spearman was used for the size model due to a few extreme outliers in model estimates.
TA B L E 4 Fit statistics and cross validation results for the forage fish aggregation (FFA) models.

F I G U R E 2
Beta parameter estimates for the (a) autumn and (b) spring community models. Orange and blue grid squares represent a significant relationship (either positive or negative, respectively) between the probability of occupancy and each environmental covariate. White grid cells represent non-significant relationships. Beta parameters were considered significant if they had at least 95% posterior support. The top panels depict the community-level covariate effect size, calculated as the mean absolute parameter values.
non-significant relationship with SST ( Figure 2a). In spring, depth had Appendix 2: Table A2) but was still a significant driver for part of the community (autumn: 9 of 14 spp; spring: 5 of 10 spp; Figure 2).
In autumn, the dynamic surface features, FSLE (i.e., presence of submesoscale eddies and filaments) and Chl Fprob (i.e., productivity front persistence) had the highest effect size (1.87 and 0.96, respectively, Appendix 2: Table A2a) but were only significant drivers of occurrence for about half the community (7 spp each; Figure 2a). In the spring, the covariates with the highest effect size were FSLE (3.49) and SST Fprob (0.69; Figure 2b and Appendix 2: Table A2b). While FSLE was significant for seven species during spring, SST Fprob was only a significant driver for five species (Figure 2b). For most species, FSLE had a positive relationship F I G U R E 3 (a) Beta parameter estimates and credible intervals (CIs) from the forage fish aggregation (FFA) abundance model. Gray points represent the parameter medians, thick lines the 50% CI, and thin lines the 95% CI. Parameters with gray CIs are not significant. (b-f) FFA abundance predictions relative to the five strongest beta parameters. Shaded areas represent the 95% CI of the estimate.
with the probability of occurrence; only two species in the autumn (saury and silver anchovy) and one in the spring (butterfish) were negatively associated with FSLE. Chl Fprob had a mostly positive relationship with occurrence (5 of 7 spp); only alewife and mackerel showed the opposite relationship (Figure 2a). In the spring, SST Fprob had a mostly negative relationship with occurrence (4 of 5 spp; Figure 2b).
In both the FFA abundance and size models, the majority of environmental covariates had statistically significant effects (i.e., 95% CIs did not contain 0; Figures 3 and 4). The most important covariates for estimating abundance were MLD, depth, salinity, SST anomaly, and Chla ( Figure 3). As MLD and depth decreased, FFA abundance increased. Conversely, as salinity, SST anomaly, and Chla increased, so did FFA abundance. Although Chla had the lowest absolute parameter estimate of these five covariates, the predicted median abundance across the range of Chla in the study area was two orders of magnitude higher than median abundance across the range of depth, and four orders of magnitude higher than for MLD, salinity, and SST anomaly. Only Chl Fprob and FSLE were not significant for abundance.

F I G U R E 4 (a)
Beta parameter estimates and credible intervals (CI) from the forage fish aggregation (FFA) size model. Points represent the parameter medians, thick lines the 50% CI, and thin lines the 95% CI. The inset (gray box) enlarges the scale for parameters close to zero to improve readability. Parameters with gray CIs are not significant. (b-d) FFA size predictions relative to the three strongest beta parameters. Shaded areas represent the 95% CI of the estimate.
The most important covariates for estimating FFA size were BPI, depth, and MLD ( Figure 4). BPI had a strong negative effect on estimated FFA size but was most influential in the −80 to −40 range which corresponds to steep crevasses or valleys. Higher values of BPI corresponding to bathymetric flats (near 0) and steep hills or peaks (>1), were associated with very low FFA size. As depth increased, the estimated size of aggregations decreased. MLD had opposite effects on abundance versus size: as MLD increased (i.e., the water column became less stratified), there were fewer, but larger, FFAs. SST anomaly was the only non-significant covariate for aggregation size.

| Spatial and seasonal patterns
The predicted species richness (i.e., summed occupancy) from the community models was highest nearshore and in the Gulf of Maine in the autumn; in spring, species richness was lower overall and less variable across the study area ( Figure 5). In general, the predicted occurrence distributions of forage fishes in autumn, except for butterfish, were more concentrated either nearshore or in the Gulf of Maine ( Figure 6). Conversely, the spring-predicted occurrence distributions were more diffused across the shelf ( Figure 7).  (Figure 8). The areas of high FFA abundance were similar across all seasons, while the magnitude of abundance varied from highest in the summer to lowest in the winter. Across the mid-to-outer shelf in the FFA study area, there were consistently low counts (1-10) of FFAs even during the summer, and almost no FFAs predicted in those areas during the winter. Autumn had a higher abundance of predicted FFAs than spring, particularly nearshore.
In general, the predicted size of aggregations was more spatially uniform than abundance, with smaller FFAs predicted near the continental shelf break. The predicted size of aggregations was larger in the winter than in the other seasons, corresponding to less shelf stratification (i.e., deeper MLD). Since the predicted size was fairly constant across much of the shelf, the surface availability (abundance × size) followed similar spatial patterns to that of FFA abundance.

| Forage fish community types
From the k-means cluster analysis of the community occupancy predictions, we found six distinct forage fish community types in the NES study area for both seasons (Figures 9 and 10). For both seasons, Community Type 1 had the highest mean species richness (autumn: 5.79; spring: 3.37; Table 5) with 12 of 14 species in the autumn and 9 of 10 species in the spring having prevalence >10% (Figures 9 and 10, Appendix 2: Table A3). Community Type 2 had the second highest species richness (autumn: 2.41; spring: 1.56; Table 5). Herring dominated Communities 2 and 3 in the autumn, while in the spring, alewife was more prevalent. Community Types 4 and 5 were butterfish-dominated in the autumn ( Figure 9); whereas in the spring, Community Type 4 was not dominated by any species and Community Type 5 remained butterfish-dominated ( Figure 10).
During the autumn, Community Type 6 had the lowest species richness (0.28) with most forage fish species uncommon; while in the spring, Community Type 5 had the lowest species richness (1.03) and was butterfish-dominated.
Within the FFA study area, four community types were represented in the autumn, while during the spring, five were represented (Figures 9 and 10 and density (autumn: 9.11/km 2 ; spring: 0.70/km 2 ), and Community Type 6 had the lowest for both ( Table 5).

| Species co-occurrence
Species co-occurrence patterns, representing the residual speciesto-species associations from the random effects, showed correlations among species both temporally and by tow (Appendix 2: Figure A3). In the autumn, the temporal associations indicated that two groups of forage fishes fluctuate with each other over annual scales (Appendix 2: Figure A3a): (1) blueback herring, saury, alewife, thread herring, menhaden, mackerel, and herring; and (2) silver anchovy and Spanish sardine. Tow-level associations in the fall showed complex co-occurrence patterns (Appendix 2: Figure A3b). In the spring, there were few significant temporal or tow associations (Appendix 2: Figure A3c,d).

| DISCUSS ION
The community and aggregation patterns we found for NES forage

| Spatiotemporal patterns in forage fish availability
The predicted distribution of FFAs likely represents a more appropriate estimation of prey availability for surface-feeding predators, since modeling surface aggregations incorporates a measure of patchiness driven by schooling behavior and vertical accessibility. and Centropages typicus (Kane & Prezioso, 2008;Suca, Deroba, et al., 2021), distributions of predators (Bangley et al., 2020;Goyert et al., 2018;Roberts et al., 2016), and commercial fishing effort for some of these forage fishes, which often target large surface schools (VMS Commercial Fishing Density Data, www.north easto ceand ata. org; SEDAR, 2015).
Moreover, we found that areas of high predicted FFA abundance coincide with areas of high predicted species richness from the forage fish community models. In contrast, areas of high FFA abundance, or surface prey availability, do not always correlate with areas of high individual species occupancy. Based on this relationship, we would also expect high abundance of FFA in areas of high species richness in the NES outside the FFA study area (i.e., the Gulf of Maine, Nantucket Shoals, and Georges Banks). While we currently lack empirical data on FFA in these areas, they are known for high productivity and as important foraging areas for marine mammals and seabirds (Overholtz et al., 2007), suggesting that further study of the spatiotemporal patterns of FFA in these areas would inform our understanding of predator distributions and behaviors.

F I G U R E 8
Predicted spatial distribution of forage fish aggregation (FFA) abundance, size, and surface availability (abundance × size, cumulative m 2 ). The spatial extent of the size and availability predictions has been reduced to only include on-shelf areas. research is needed on the species composition of FFAs in these areas over time to validate these model predictions.

| Joint species distribution models reveal forage fish community dynamics
The residual species associations from the forage fish community models provide information on the potential for species interactions among forage fishes (i.e., intraguild interactions), showing patterns of co-occurrence across years or tows that are not explained by the species' respective environmental niches.
Given the high conditional model fit, especially for the autumn community model, it is possible that the temporal and tow associations identified represent true species interactions. The temporal associations indicate that groups of species are either linked (positive associations) or asynchronous (negative) over time, driven F I G U R E 1 0 Spring forage fish community types: six distinct community types were identified across the NES study area via k-means cluster analysis. Circular bar plots depict the prevalence (i.e., mean occurrence probability) of species within the community type (y-scale max = 1). Species codes are defined in Table 1. Species with a prevalence <0.003 are not shown. Black outlines define the forage fish aggregation study area. See Appendix 2: Table A3b for prevalence estimations. NES -U.S. Northeast Continental Shelf.
by long-term processes or behaviors for which our models do not account. The tow-level associations may indicate species that are spatially associated with each other, such as herring, alewife, and blueback herring, which form large, multispecies schools for foraging (Turner et al., 2016). However, there is a paucity of data on forage fish behavior in the NES ecosystem, and much is still unknown about interspecies interactions within this community. We must exercise caution when inferring species interactions based on residual associations, since they could also indicate that our models are missing environmental covariates that could explain some of this residual covariance. The potential existence of intraguild interactions suggested by our study indicates a need for more research to elucidate these relationships within the forage fish community, especially given evidence of asymmetrical distribution shifts of species in the NES due to climate-induced warming (Friedland et al., 2020;Hare et al., 2016;Kleisner et al., 2017). A better understanding of intraguild interactions, particularly regarding behaviors influencing the formation and composition of surface FFAs, could allow for more accurate estimations of FFA distribution and prey availability to predators across space and time.

| Community distribution
Submesoscale filaments (FSLE) and front persistence (Fprob; autumn: productivity; spring: SST), both dynamic surface features, were the most important environmental drivers for community distribution across seasons, but only for a subset of species.
Similarly, Suca, Deroba, et al. (2021) found that total kinetic energy, a proxy for mesoscale eddies, was an important driver for the distributions of sand lance, herring, alewife, and mackerel, suggesting, along with our results, that mesoscale and submesoscale eddies are important for the occurrence and abundance of some forage fish species. Filament ridges are also associated with foraging behavior of top predators, including seabirds, sharks, and marine mammals that are likely feeding on forage fishes and other prey that aggregate to these features (Abrahms et al., 2018;Cotté et al., 2011;Della Penna et al., 2015;Kai et al., 2009). Productivity and SST frontal features have also been associated with higher abundance of zooplankton (Genin et al., 2005), forage fishes (Friedland et al., 2020), and top predators (Scales, Miller, Hawkes, et al., 2014), supporting the relationships found in our community models.
Sea surface temperature and depth were important predictors for nearly the entire forage fish community but had a weaker influence than the submesoscale eddies and front persistence. SST is well established as a regulating factor in the distribution of pelagic fishes via physiological thermal niche constraints (Ma et al., 2022), and top predator distributions are known to be associated with SST patterns, as they track prey distributions (Hazen et al., 2013).
SST and depth were also consistently predictive of forage fish distributions in single-species modeling frameworks (Friedland et al., 2020;Holland et al., 2021;Suca, Deroba, et al., 2021). Thus, our models provide further evidence that SST and depth gradients are important for structuring the community distribution of forage fishes. 1.17 ± 0.14 1.17 ± 0.10 0 ± 0.00 0.00 Note: Species richness was calculated for each community type as the mean ± SD of the summed occupancy for all grid cells classified to that type. FFA abundance metrics were calculated for the community types represented in the FFA study area. FFA abundance is the estimated number (mean ± SD) of FFA spatially overlapping each community type. FFA density is calculated as the number of FFA per km 2 for each community type.
TA B L E 5 Species richness, forage fish aggregation (FFA) abundance, and FFA density by community type across the US Northeast Continental Shelf (NES) and FFA study areas.

| Aggregating behavior
The complex interplay between MLD and the benthic terrain combined with behavioral mechanisms driven by foraging needs and predator avoidance may be associated with the spatial patterns we found in FFA abundance and size. In the NES, MLD is a temporal indicator of seasonal stratification of the water column with weak stratification and deeper MLD during the winter due to strong upper ocean mixing, contrasting with strong stratification and a shallow MLD during the summer (Cai et al., 2021;Li et al., 2015). Seasonal spatial variation in MLD reflects differences in subsurface temperature and salinity gradients (i.e., thermo-and haloclines; collectively, pycnoclines) across the shelf due to localized bathymetry and coastal processes such as freshwater inputs and tidal mixing (Cai et al., 2021;Li et al., 2015). However, the direction of this relationship differed with shallower MLD (stronger stratification) associated with more FFA, and deeper MLD (weaker stratification) associated with larger FFA.
Sharp pycnoclines due to strong stratification are associated with high subsurface productivity (Weston et al., 2005), which is, in turn, linked to increased aggregations of zooplankton (Genin, 2004), driving higher abundance of FFA, but smaller individual FFA size.
Additionally, links have been found between stratified areas where abrupt changes in topography, such as steep depressions (i.e., negative BPI values), cause internal waves that drive large aggregations of forage fish to the surface (Cox et al., 2018). These conditions are commonly found at the shelf edge or offshore banks, resulting in local upwelling that depresses MLD, allowing resources to disperse and larger FFAs to form.
The regions of highest predicted FFA abundance for our models were also coincident with the relatively shallow depths associated with the mouths of major freshwater inputs, such as the Chesapeake and Delaware Bays, suggesting that in the Mid-Atlantic and New York Bights, shallow habitat may function both as convergence zone (i.e., a mixing zone between water masses and fine-scale tidal currents) and refugia from predation (Litz et al., 2014). Moreover, on continental shelves like the NES ecosystem, shallow depth may drive zooplankton aggregation formation by blocking diel migration back to deeper waters (i.e., topographic blockage, Isaacs & Schwartzlose, 1965). Topographic blocking traps those planktonic aggregations in shallower regions, exposing them to forage fish predation, and, subsequently, resulting in FFA formation via trophic focusing (Genin, 2004). However, BPI, a measure of the benthic topography, was a more important predictor than depth for FFA size.
In both models, shallower depths were an indication of more and bigger FFA, while larger FFAs were associated with extremely negative BPI, indicating abrupt, benthic depressions or valley bottoms (Lundblad et al., 2006). This connects back to the aforementioned relationship with MLD, where stratified regions interact with abrupt benthic depressions to spur the formation of large FFAs.

| Contrasting drivers of distributions and aggregations
While FFA abundance may overlap spatially with community species richness, these patterns are driven by different oceanographic processes. MLD, a subsurface dynamic variable, was important to FFA abundance and size. However, MLD was not important at the community occupancy level with a low effect size in both autumn and spring. In contrast, the environmental drivers with the largest influence on community-level occupancy were dynamic surface processes: eddies and frontal features. The differences in the oceanographic processes driving community occupancy versus FFA abundance are likely due to the aggregation dataset inherently including behavioral information (i.e., surface schooling behavior), which is absent from the occupancy dataset. Subsurface features describing the vertical water column may be more tightly linked with behavioral processes related to depth (i.e., diel migration, surface aggregation formation). Relatedly, differences in oceanographic drivers may also reflect that FFAs are nested hierarchically within community occupancy, such that FFAs represent finer-scale structures within the larger community distribution (Fauchald et al., 2000).
Conversely, depth, a static habitat feature, was important to all levels of organization (species occupancy, community occupancy, and FFA abundance/size) and has been an important predictor of forage fish abundance in multiple studies of the NES ecosystem (Friedland et al., 2019;Suca, Deroba, et al., 2021). This finding suggests that some static habitat features may influence forage fishes' spatial distribution regardless of scale or may indicate that depth integrates several important processes in one measure.

| Implications for ecosystem change
The FFA patterns and community dynamics described by our models are reliant upon relatively recent historical data (FFA: 2012, community SDM: 1997. Due to climate change, the NES is experiencing rapid warming at three times the global average (Pershing et al., 2021) and increased frequency of marine heatwaves (Laufkötter et al., 2020). Concurrent decreases in surface salinity combined with rising temperatures are expected to drive increased seasonal stratification (i.e., MLD; Pershing et al., 2021), which, based on our findings, has the potential to affect the distribution, abundance, and aggregating behavior of forage fishes in this system. Climate-induced warming has already induced detectable broadscale and seasonal distribution shifts across the trophic web in the NES, from plankton (Chust et al., 2014), fish, and macroinvertebrates (Friedland et al., 2020) to predatory fishes (Muhling et al., 2017) and marine mammals (Pendleton et al., 2022), and is expected to influence the distributions of key forage fishes included in this study, such as sand lance, herring, and menhaden Staudinger et al., 2020;Suca, Wiley, et al., 2021). These distribution shifts are not expected to occur symmetrically and may not result in wholescale northward shifts of the present community.
Instead, these changes could result in the development of novel, noanalog communities (i.e., community composition unlike that known from the historical or paleontological record), leading to changes in community relationships among forage fish species and affecting spatiotemporal patterns in aggregating behavior (Williams & Jackson, 2007). Moreover, climate-induced warming can affect the abundance, size, quality (i.e., lipid content), and intraguild dynamics of forage fishes, disrupting trophic energy transfer to higher trophic level predators and fueling mass mortality events of predators (Arimitsu et al., 2021). Our results highlight the need for additional research into the effects of climate change on subsurface dynamic processes and how those effects may impact trophic interactions.
Additionally, the imminent development of offshore wind energy (i.e., the construction of large-scale offshore windfarms) in the NES may contribute to meso-and submesoscale changes in localized current patterns circulation and subsurface dynamics, such as stratification (Christiansen et al., 2022;Dorrell et al., 2022). Although a recent study shows that wind energy lease areas overlap considerably with the core habitat of forage fish species , it is unknown how these habitat alterations may influence forage fish aggregating behavior in the NES (but see Raoux et al., 2017), and, thus, realized prey availability. Unanticipated synergistic interactions between climate change effects, offshore wind energy development, and other anthropogenic stressors, such as pollution and commercial and recreational fisheries, could further alter patterns of forage fish availability across the NES shelf.

| Limitations and sources of bias
The NOAA bottom trawl surveys are not designed to sample pelagic or surface schooling species, and there may be differential catchability rates among the forage fishes, as well as biases in size selectivity due to gear design. Moreover, water depth influences the probability of capture of forage fishes by bottom trawls, since midwater forage fishes may be more available to the gear over shallower bottoms (<50 m) compared to deeper waters where they are more likely to be captured only on the deployment and recovery of the gear. However, while these species are defined as pelagic, many do use the entire water column on the continental shelf via diel migration, predator avoidance, foraging, and spawning behaviors (Freon & Misund, 1999). In addition, there was a significant gear change during our study period (2009) that led to notable catch changes for many forage fish species, especially sand lance (Miller et al., 2010). After the gear change, the trawl catch of sand lance has been considered unreliable for the purposes of abundance monitoring (Richardson et al., 2014); however, it has been deemed adequate for presence-absence occurrence models that span that period (Friedland et al., 2020) Due to variations in vertical distribution patterns of schools, FFAs may be more detectable at shallower depths, while seasonal behaviors of some schooling species, such as shifts to deeper water, may also limit FFA detection by aerial surveys (Freon & Misund, 1999).
To address these detection biases, we limited our interpretation to surface FFA patterns, acknowledging that the data does not sample all FFA in the water column. In addition, the aggregation models were limited by the inability to identify FFA composition at specieslevel and by the aerial survey sampling frequency, preventing us from integrating the datasets into a combined model for forage fish species and aggregations. Thus, we were reliant upon post-hoc comparisons of our predicted distributions. Finally, FFA occurrence is a highly ephemeral process, occurring over spatiotemporal scales smaller than the 4 km-, daily-scale oceanographic data we used in our models, or the 4-8 aerial surveys/year conducted in this study.

CO N FLI C T O F I NTE R E S T S TATE M E NT
The authors declare no conflict of interest.  (Cayula & Cornillon, 1992). For the detection of SST fronts, we used a 0.4°C temperature threshold (Cayula & Cornillon, 1992). To increase detection of coastal and smaller scale fronts, we adjusted the default tool settings to a 16 × 16 pixel window, a 4 window stride, and a 5 × 5 kernel (Roa-Pascuali et al., 2015). We also adjusted the spatial cohesion parameters to reflect the smaller histogram window: 0.87 minimum single population spatial cohesion and 0.88 minimum global population spatial cohesion (Cayula & Cornillon, 1992). For the detection of Chla fronts, we optimized the CCA parameters to better detect coastal fronts, using a 0.4 mg/m 3 threshold. As with the SST fronts, we adjusted the parameters as follows: a 16 × 16 pixel window, a 3 window stride and a 5 × 5 kernel, changing the minimum single and global population cohesion values accordingly. We also set the minimum criterion function to 0.74 to allow "curvier" fronts to be identified (Cayula & Cornillon, 1992). We used the thin option to ensure one-pixel-width fronts.

DATA AVA I L A B I L I T Y S TAT E M E N T
Frontal gradients were calculated for SST and Chla using the Belkin O'Reilly gradient algorithm (Belkin & O'Reilly, 2009) with the detectFronts function in the grec package in R (version 1.4.1). Then, all detected fronts and calculated gradients over a 7-day moving window were combined into daily composite frontal maps (Scales, Miller, Embling, et al., 2014), and used to calculate two frontal metrics: Fprob and Fmean (Miller, 2009;Suberg et al., 2019) for each day of the study period . Fprob is a measure of front persistence and is calculated as the probability of a front being detected each day over the rolling 7-day window. Fmean is a measure of front intensity and is calculated as the average of the frontal gradient within the detected fronts.

TA B L E A 3
Estimated prevalence (mean probability of occurrence) of species within distinct forage fish communities for the (a) autumn and (b) spring community models.

F I G U R E A 3
Residual species-to-species associations (i.e., species which co-occur more or less than expected based on species' niches) in the (a, b) autumn and (c, d) spring community models. Panels (a) and (c) are due to the temporal random effect, while panels (b) and (d) are due to the tow random effect. Orange and blue indicate species pairs with at least 0.95 posterior support for a positive or negative association, respectively. The intensity of color and size of the marker indicates the strength of the association (in units of correlation). Species codes are defined in Table 1.